Efficient supervised optimum-path forest classification for large datasets
Authors
Abstract
Data acquisition technologies can provide large datasets with millions of samples for statistical analysis. This creates a tremendous challenge for pattern recognition techniques, which need to become more efficient without losing their effectiveness. We have tried to circumvent the problem by reducing it to the fast computation of an optimum-path forest (OPF) in a graph derived from the training samples. In this forest, each class may be represented by multiple trees rooted at representative samples. The forest is a classifier that assigns to any new sample the label of its most strongly connected root. This methodology has been successful with different graph topologies and learning techniques. In this work we focus on one of the supervised approaches, which has offered considerable advantages over Support Vector Machines and Artificial Neural Networks in handling large datasets. We propose (i) a new algorithm that speeds up classification and (ii) a solution that reduces the training set size with negligible effect on classification accuracy, further increasing efficiency. Experimental results show improvements over our previous approach and advantages over other existing methods, which make the new method a valuable contribution to large dataset analysis.
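To make the idea of "the label of its most strongly connected root" concrete, here is a minimal Python sketch of a supervised OPF classifier. The abstract does not give the details, so several choices here are assumptions: the graph is complete with Euclidean arc weights, the path cost is the maximum arc weight along the path, the roots (prototypes) are the endpoints of minimum-spanning-tree edges that join different classes, and classification scans training nodes in increasing cost order so it can stop early. The function names (`find_prototypes`, `train_opf`, `classify`) are hypothetical and do not come from any OPF library.

```python
import heapq
import math


def find_prototypes(X, y):
    """Prim's MST on the complete graph; endpoints of MST edges that
    join samples of different classes are taken as prototypes (assumption)."""
    n = len(X)
    in_tree = [False] * n
    best = [math.inf] * n
    parent = [-1] * n
    best[0] = 0.0
    prototypes = set()
    for _ in range(n):
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        p = parent[u]
        if p != -1 and y[p] != y[u]:
            prototypes.update((p, u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(X[u], X[v])
                if d < best[v]:
                    best[v], parent[v] = d, u
    return prototypes or {0}  # single-class data: fall back to one root


def train_opf(X, y):
    """Dijkstra-like propagation from the prototypes, where a path costs
    as much as its largest arc. Returns nodes sorted by increasing cost."""
    n = len(X)
    cost = [math.inf] * n
    label = list(y)
    done = [False] * n
    heap = []
    for p in find_prototypes(X, y):
        cost[p] = 0.0
        heap.append((0.0, p))
    heapq.heapify(heap)
    while heap:
        c, s = heapq.heappop(heap)
        if done[s]:
            continue  # stale heap entry
        done[s] = True
        for t in range(n):
            if not done[t]:
                new_cost = max(c, math.dist(X[s], X[t]))
                if new_cost < cost[t]:
                    cost[t] = new_cost
                    label[t] = label[s]  # t joins the tree of s's root
                    heapq.heappush(heap, (new_cost, t))
    order = sorted(range(n), key=lambda i: cost[i])
    return [(X[i], cost[i], label[i]) for i in order]


def classify(model, x):
    """Assign x the label of the training node offering the cheapest path.
    Nodes are sorted by cost, so the scan stops once a node's own cost
    already reaches the best cost found so far."""
    best_cost, best_label = math.inf, None
    for xs, cs, ls in model:
        if cs >= best_cost:
            break  # no later node can give a strictly smaller cost
        c = max(cs, math.dist(xs, x))
        if c < best_cost:
            best_cost, best_label = c, ls
    return best_label
```

For example, with two well-separated clusters labelled 0 and 1, `classify(train_opf(X, y), point)` would return the label of the cluster whose tree reaches `point` most cheaply. The early exit in `classify` is only one plausible way to avoid visiting every training node; it should not be read as the exact algorithm proposed in the paper.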
Similar resources
Supervised Pattern Classification Using Optimum-Path Forest
We present a graph-based framework for pattern recognition, called Optimum-Path Forest (OPF), and describe one of its classifiers developed for the supervised learning case. This classifier does not require parameters and can handle some overlap among multiple classes with arbitrary shapes. The method reduces the pattern recognition problem to the computation of an optimum-path forest in ...
Optimum-Path Forest: A Novel and Powerful Framework for Supervised Graph-based Pattern Recognition Techniques
We present a novel framework for graph-based pattern recognition techniques called Optimum-Path Forest (OPF), which has been demonstrated to be superior to traditional supervised pattern recognition techniques, such as Artificial Neural Networks using Multilayer Perceptrons and Support Vector Machines, in terms of both accuracy and execution time. The OPF-based classifiers model the pro...
Land Use Classification Using Optimum-Path Forest
This paper introduces the Optimum-Path Forest for land use classification, aiming at better environmental management, using images obtained from the CBERS 2B CCD satellite covering the area of the Rio das Pedras watershed, Itatinga City, São Paulo State, Brazil. We also compared the Optimum-Path Forest algorithm with well-known supervised classifiers: Artificial Neural Networks using Mu...
Recent advances on optimum-path forest for data classification: supervised, semi-supervised and unsupervised learning
Although many pattern recognition techniques exist, there is still room for improvement and new approaches. In this book chapter, we revisit the Optimum-Path Forest (OPF) classifier, which has been evaluated in recent years in a number of applications involving supervised, semi-supervised and unsupervised learning problems. We also present a brief compilation of...
Optimum Path Forest Approach for Image Retrieval based on Context
CBIR systems involve large datasets with millions of image samples for statistical analysis, which poses a tremendous challenge for pattern recognition techniques that need to be more efficient without compromising effectiveness. The image samples are stored in a database in the form of feature vectors. Pattern recognition techniques impose a high computational burden for learning the dis...
Journal title:
- Pattern Recognition
Volume 45, Issue
Pages -
Publication date 2012